ABSTRACT

Although many severe acute respiratory syndrome-like coronaviruses (SARS-like CoVs) have been identified in bats in China,
Europe, and Africa, most have a genetic organization significantly distinct from human/civet SARS CoVs in the receptor-binding
domain (RBD), which mediates receptor binding and determines the host spectrum, resulting in their failure to cause human
infections and making them unlikely progenitors of human/civet SARS CoVs. Here, a viral metagenomic analysis of 268 bat rectal
swabs collected from four counties in Yunnan Province identified hundreds of sequences relating to alpha- and betacoronaviruses.
Phylogenetic analysis based on a conserved region of the RNA-dependent RNA polymerase gene revealed that the alphacoronaviruses
were diverse, with some obvious differences from those reported previously. Full genomic analysis of a new
SARS-like CoV from Baoshan (LYRa11) showed that it was 29,805 nucleotides (nt) in length with 13 open reading frames
(ORFs), sharing 91% nucleotide identity with human/civet SARS CoVs and the most recently reported SARS-like CoV Rs3367,
while sharing 89% with other bat SARS-like CoVs. Notably, it showed the highest sequence identity with the S gene of SARS
CoVs and Rs3367, especially in the RBD region. Antigenic analysis showed that the S1 domain of LYRa11 could be efficiently
recognized by SARS-convalescent human serum, indicating that LYRa11 is a novel virus antigenically close to SARS CoV.
Recombination analyses indicate that LYRa11 is likely a recombinant descended from parental lineages that had evolved into a
number of bat SARS-like CoVs.


IMPORTANCE 

Although many severe acute respiratory syndrome-like coronaviruses (SARS-like CoVs) have been discovered in bats worldwide,
there are significant differences in gene structure, particularly in the S1 domain, which is responsible for host tropism
determination, between bat SARS-like CoVs and human SARS CoVs, indicating that most reported bat SARS-like CoVs are not the
progenitors of human SARS CoV. We have identified diverse alphacoronaviruses and a close relative (LYRa11) of SARS CoV in bats
collected in Yunnan, China. Further analysis showed that alpha- and betacoronaviruses have different circulation and transmission
dynamics in bat populations. Notably, full genomic sequencing and antigenic study demonstrated that LYRa11 is phylogenetically
and antigenically closely related to SARS CoV. Recombination analyses indicate that LYRa11 is a recombinant from certain
bat SARS-like CoVs circulating in Yunnan Province.


Coronaviruses (CoVs) in the subfamily Coronavirinae are important pathogens of mammalian and avian animals and
currently compose four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus (1).
Members of Alphacoronavirus and Betacoronavirus are found exclusively in mammals, and some, e.g., human CoVs 229E,
NL63, and OC43, cause human respiratory diseases (2). A CoV is also the causative agent of severe
acute respiratory syndrome (SARS), the first global human pandemic disease of the 21st century, which spread to
30 countries in five continents, resulting in >8,000 human cases with 774 deaths (3, 4). SARS CoV is a member of
the Betacoronavirus genus and is largely distinct from the previously known human CoVs OC43 and 229E (5-7).
To identify the transmission source of SARS, large-scale animal screening was implemented in May 2003, and several
strains of SARS CoVs were isolated from nasal and/or fecal swabs of six masked palm civets (Paguma larvata) and one
raccoon dog (Nyctereutes procyonoides) collected from a wet market in Shenzhen retailing wild animals for exotic
foods (8). Their full genome sequences were 99.8% identical to that of human SARS CoV, and therefore civets were
deemed to be an animal reservoir of this virus (8). Further serological studies over a larger area revealed that only
civets in the market were SARS seropositive, while farmed civets were seronegative, indicating that civets likely
became infected from an unknown source in wet markets, not in the farming environment (9). Moreover, a
comprehensive analysis of cross-host
FIG 1 (A) Geo-distribution of bat CoVs in China (gray provinces); (B) locations of bat alphacoronaviruses (open circles) and SARS-like CoVs identified in our
study (solid circle) and in Ge et al.'s study (open triangle) (29). α, alphacoronavirus; β, betacoronavirus; S, SARS-like CoV.


evolution between SARS CoVs in civets and humans indicated that civets might be spillover animals rather than the
natural hosts of SARS CoV (10). In 2005, SARS-like CoVs sharing 87 to 92% nucleotide (nt) identity with SARS CoVs
were identified in horseshoe bats (11, 12). These studies provided the first evidence that bats were the natural
hosts of SARS CoVs. Since then, more SARS-like CoVs have been reported in several insectivorous bat species in China,
Europe, and Africa, but none have genomes identical to SARS CoVs (13-21). In particular, in these viruses, the key
S1 domain of the S gene, responsible for receptor binding and determining host tropisms (22, 23), shared a sequence
identity as low as 76 to 78% with SARS CoVs and had a deletion of 19 amino acids (aa) in the


TABLE 1 Details of rectal swabs from bats and positive number of bats detected by nested RT-PCR

                           Xiangyun                     Bingchuan                    Jinghong                     Baoshan                      Total
Organism                   No.  No. (%)    Clade^b      No.  No. (%)    Clade^b      No.  No. (%)    Clade^b      No.  No. (%)    Clade^b      No.  No. (%)    Clade^b
                                positive^a                   positive^a                   positive^a                   positive^a                   positive^a
Rhinolophus ferrumequinum  15   0          -            32   2 (9)      αC           30   1 (3)      αC           -    -          -            77   3 (4)      αC
Rhinolophus affinis        -    -          -            -    -          -            -    -          -            11   2 (18)     β            11   2 (18)     β
Rhinolophus hipposideros   -    -          -            4    -          -            4    1 (25)     αC           3    -          -            11   1 (9)      αC
Myotis daubentonii         22   2 (9)      αA/E         64   5 (8)      αA/D/E       -    -          -            -    -          -            86   7 (8)      αA/D/E
Myotis davidii             83   11 (13)    αA/B/D/E     -    -          -            -    -          -            -    -          -            83   11 (13)    αA/B/D/E
Total                      120  13 (11)    αA/B/D/E     100  7 (7)      αA/C/D/E     34   2 (6)      αC           14   2 (14)     β            268  24 (9)     αA/B/C/D/E/β

^a By nested RT-PCR.

^b Clade the amplicons clustered into. α, Alphacoronavirus; β, Betacoronavirus; A, myotis bat coronavirus 5; B, miniopterus bat coronavirus 1; C, hipposideros bat coronavirus
HKU10-like; D, myotis bat coronavirus HKU6-like; E, myotis bat coronavirus 4.
FIG 2 Taxonomic summary of viral reads with BLASTn (E < 10^-5) results exhibited in MEGAN 4. The number of reads in each taxonomic level is shown after
the level name. Read counts by taxon: Root, 730,668; Viruses, 32,335; Phycodnaviridae, 309; Polydnaviridae, 3,632; Retroviridae, 2,809; ssRNA viruses, 3,572;
ssRNA positive-strand viruses, no DNA stage, 3,448; Alphabaculovirus, 44; Myoviridae, 6; Podoviridae, 20; Alloherpesviridae, 3,326; Iridoviridae, 17;
Ranavirus, 8; unclassified Iridoviridae, 6; Polyomavirus, 78; Coronavirinae, 216; Picornavirales, 2,889.


S gene receptor binding domain (RBD), which mediates human infection via binding to human angiotensin-converting
enzyme 2 (ACE2) (11, 12, 24). Such key differences in the S gene between bat SARS-like CoVs and SARS CoVs determined
their different host spectrums and made them unable to infect humans and civets (25-27). Clearly, these known bat
SARS-like CoVs are not the progenitors of human/civet SARS CoVs, and an intermediate virus bridging bat-to-human/civet
transmission remains to be identified (24, 28). Recently, however, a novel SARS-like CoV (strain Rs3367) has been
described which, so far, is more closely related to SARS CoVs than any previously reported bat SARS-like CoV.
Most importantly, it has been shown to use the ACE2 receptor for cell entry, suggesting that it can cause direct
human infection without an intermediate host (29). Here, we report another novel SARS-like CoV (LYRa11) identified
from Rhinolophus affinis collected in Yunnan Province of China, which has high nucleotide and amino acid identities
in its genome, similar to those of Rs3367, particularly in the RBD region. In addition, several clades of new
alphacoronaviruses have been identified in Rhinolophus and Myotis spp.

MATERIALS AND METHODS 

Ethics statement. The procedures for sampling of bats in this study were
reviewed and approved by the Administrative Committee on Animal
Welfare of the Institute of Military Veterinary, Academy of Military Medical
Sciences, China (Laboratory Animal Care and Use Committee Authorization,
permit number JSY-DW-2010-02). All live bats were maintained
and handled according to the Principles and Guidelines for Laboratory
Animal Medicine (2006), Ministry of Science and Technology, China.


Sample collection and preparation. In total, 268 adult bats were live
captured with nets in 2011 in 4 counties/prefectures of Yunnan Province
(Fig. 1). Within each county there was either a single sampling location or
two adjacent sites. Bat details are shown in Table 1. All specimens were
collected rectally using sterile swabs and immediately transferred to viral
transport medium (Earle's balanced salt solution, 0.2% sodium bicarbonate,
0.5% bovine serum albumin, 18 µg/liter amikacin, 200 µg/liter vancomycin,
160 U/liter nystatin) and stored in liquid nitrogen prior to transportation
to the laboratory, where they were stored at -80°C. All captured
bats were released after sample collection.

FIG 3 Phylogenetic analysis of RdRp amplicons obtained in this study and representatives of species in genera Alphacoronavirus and Betacoronavirus based on
the maximum likelihood method. All sequences were classified into two groups: group Alphacoronavirus comprising 17 clades, and group Betacoronavirus
comprising 10 clades. Clades containing approved species are in italics; clades containing unapproved novel species are marked with an asterisk. All amplicons
in this study are marked as filled triangles, with previously reported bat CoVs as open triangles. Middle letters identify the viral host: H, human; C, civet; B, bat;
Bo, bovine; M, murine; Ca, canine; F, feline.


TABLE 2 Comparison of full genomic lengths and ORF amino acid identities of SARS and SARS-like CoVs^a

            LYRa11       Tor2                         Rs3367                       Rf1                          Rp3
FL or ORF   Length (aa)  Length (aa)  % aa identity   Length (aa)  % aa identity   Length (aa)  % aa identity   Length (aa)  % aa identity
FL (in nt)  29,805       29,751       -               29,792       -               29,709       -               29,736       -
1a          4,382        4,377        95.1            4,382        95.1            4,377        93.9            4,380        95.4
1b          2,628        2,641        98.9            2,628        99.0            2,628        98.6            2,628        98.9
S           1,259        1,255        89.6            1,256        89.9            1,241        79.0            1,241        81.1
3           274          274          91.3            274          91.6            274          81.5            274          89.1
4           NP           154          NA              114          NA              114          NA              NP           NA
E           76           76           98.7            76           98.7            76           94.8            76           98.7
M           221          221          97.7            221          95.2            221          95.5            221          95.9
7           63           63           95.3            63           91.7            63           89.1            63           87.5
8           122          122          94.3            122          94.1            122          91.1            122          93.5
9           44           44           91.1            44           90.5            44           93.3            44           91.9
10          NP           39           NA              NP           NA              122          NA              NP           NA
11          NP           84           NA              NP           NA              NP           NA              NP           NA
10b         121          NP           NA              121          81.1            NP           NA              121          79.5
N           422          422          97.9            422          97.8            421          95.5            421          97.9
13          98           98           96.0            98           93.7            97           82.7            97           86.7
14          70           70           94.4            70           92.6            70           84.5            70           91.5

^a The accession numbers of Tor2, Rs3367, Rf1, and Rp3 are AY274119, KC881006, DQ412042, and DQ071615, respectively; FL, full genome sequence (nt); % aa identity shows
amino acid sequence identity with LYRa11; NP, not present; NA, not available. The highest identities are shaded.


Metagenomic analysis and RT-PCR screening. All specimens were
pooled and subjected to viral metagenomic analysis as per our published
method, using barcode primers for differentiation of sample species and
locations (30). All sequences generated in one lane by Solexa sequencing
(BGI) were subjected to BLASTn searches (http://blast.ncbi.nlm.nih.gov/Blast.cgi)
against the nonredundant database of GenBank, and all sequences
with an E value of <10^-5 were imported into MetaGenome Analyzer
v.4 (MEGAN) to determine their taxonomic classification (30).
Sequences assigned to CoVs were used for further analysis. Nested reverse
transcription (RT)-PCR primers targeting a 440-bp fragment of the
RNA-dependent RNA polymerase (RdRp) gene were synthesized based on
previous publications (31, 32). Total RNA of each rectal swab was extracted
automatically using the RNeasy minikit (Qiagen) in a QIAcube (Qiagen).
Reverse transcription was performed with the 1st cDNA synthesis kit
(TaKaRa) according to the manufacturer's protocol. The cDNA was
amplified using the PCR master mix (Tiangen) with the following PCR
programs: 30 cycles (outer PCR) or 35 cycles (inner PCR) of denaturation at
94°C for 30 s, annealing at 54°C for 30 s, and extension at 72°C for 40 s,
with double-distilled water (ddH2O) as a negative control. Positive PCR
amplicons were ligated into pMD18T vector (TaKaRa) and used to
transform DH5α competent Escherichia coli (Tiangen). Six clones of each
amplicon were randomly picked for sequencing by the Sanger method in an
ABI 3730 sequencer (Invitrogen). All strains in this study were named
according to the following rules: the first two letters represent the
sampling location, with the remaining letters identifying the host species and
numbers referring to the sampling order.
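
The strain-naming rule can be illustrated with a small parser. The following Python sketch is not part of the published method: the function name `parse_strain` is hypothetical, and the pattern assumes that the host code is one uppercase letter followed by lowercase letters, as in the names LYRa11, XYMd3, and BCMda57 used in this study.

```python
import re
from typing import NamedTuple


class StrainName(NamedTuple):
    location: str  # two-letter sampling-location code
    host: str      # host-species code
    number: int    # sampling order


# Assumed layout: two uppercase letters, a host code (one uppercase letter
# followed by lowercase letters), then the sampling-order digits.
_PATTERN = re.compile(r"^([A-Z]{2})([A-Z][a-z]+)(\d+)$")


def parse_strain(name: str) -> StrainName:
    """Split a strain name such as 'LYRa11' into its three parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a study-style strain name: {name!r}")
    location, host, number = match.groups()
    return StrainName(location, host, int(number))
```

For example, `parse_strain("BCMda57")` separates the location code `BC`, the host code `Mda`, and the sampling number 57. The sketch deliberately does not map codes to place or species names, because the text does not list that mapping.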

Full genome sequencing. To obtain the full genome of LYRa11, 16
degenerate PCR primer pairs were designed using GeneFisher, based on
human/civet SARS CoV and bat SARS-like CoV sequences available in
GenBank, targeting almost the full length of the genome (sequences
available upon request). For amplifying the terminal ends, 3' and 5' rapid
amplification of cDNA ends (RACE) kits (TaKaRa) were employed. Viral
cDNA was prepared as described above directly from positive samples and
amplified using the Fast HiFidelity PCR kit (Tiangen). The amplicons
were sequenced after blunt ligation into pZeroBack vector (Tiangen).
Overlapping amplicons were assembled with SeqMan v.7.0 into full
genomic sequences. Open reading frames (ORFs) of LYRa11 were
determined by Vector NTI v.8, followed by comparison with those of other
SARS CoVs and bat SARS-like CoVs.

Phylogenetic analysis of amplicons. All 440-bp-long amplicons were
aligned with their closest phylogenetic neighbors in GenBank using
ClustalW v.2.0. Representatives of different species in the genera
Alphacoronavirus and Betacoronavirus as well as some unapproved species were
included in the alignment. Phylogenetic and molecular evolutionary
analyses were conducted by the maximum likelihood method using
MEGA v.6 with the Tamura-Nei substitution model and 1,000 bootstrap
replicates (33).
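
Nucleotide identities between aligned sequences are reported throughout this study. As a minimal sketch of how such a value can be computed, the Python function below counts matching positions over the ungapped columns of two pre-aligned sequences. It is a plain proportion of identical sites, not the Tamura-Nei model used for tree building, and the function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two aligned sequences, ignoring gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions exist (for example, counting gaps as mismatches), so values from this sketch may differ slightly from those produced by alignment software.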

Morphological observation by electron microscopy. The positive
swab was examined for viral particles of LYRa11 as per our previous
description (34). Briefly, 100-µl swab suspensions were centrifuged at
120,000 × g for 3 h in an SW55Ti rotor (Beckman), and the resulting
pellets were resuspended in 20 µl SM buffer (50 mM Tris, 10 mM MgSO4,
0.1 M NaCl, pH 7.5) and directly negatively stained with 2%
phosphotungstic acid for observation with a JEM-1200 EXII transmission electron
microscope (JEOL).

S1 expression and antigenicity assay. To characterize the antigenic
reactivity of S proteins of bat SARS-like CoVs with human SARS CoV
antibody, S1 fragments of human SARS CoV BJ01 (AY278488) and bat
SARS-like CoVs LYRa11 and Rp3 (DQ071615) were expressed as fusion
proteins with enhanced green fluorescent protein (EGFP) in BHK-21 cells
and subjected to Western blot analysis using human convalescent-phase
serum from a SARS patient in 2003. Briefly, the S1 fragment of SARS CoV
BJ01 (nt 3 to 2028 of the S gene) was amplified from pcDNA3.1-S. The
corresponding S1 fragments of LYRa11 and Rp3 were amplified from the
above-described cDNA and commercially synthesized (GenScript). Three
S1 fragments were inserted into pEGFP-C1 (Clontech) between XhoI and
BamHI restriction sites to construct three S1-expressing plasmids,
pEGFP-BJ, pEGFP-LY, and pEGFP-Rp3. These three plasmids, along with
pEGFP-C1 (as a control), were transiently expressed in BHK-21 cells
using FuGENE HD transfection reagent (Promega). Total proteins were
harvested 24 h posttransfection with M-PER mammalian protein extraction
reagent (Thermo Scientific), and concentration was measured by the
BCA protein assay kit (Tiandz). A total of 20 µg total protein was boiled in
2× protein loading buffer (Tiangen) for 10 min, separated on 10%
SDS-PAGE, and transferred onto a nitrocellulose membrane (Millipore). The
blocked membrane was then incubated with primary antibody mixture
(SARS-convalescent human serum, rabbit anti-EGFP antibody [Beyotime],
and 5% skimmed milk [vol/vol/vol = 1:1:1,000]) at 4°C overnight
followed by a secondary antibody mixture (peroxidase-conjugated mouse
anti-human antibody [ZSGB-Bio], IRDye 800CW goat anti-rabbit secondary
antibody [LI-COR Biosciences], and 5% skimmed milk [vol/vol/vol =
3:5:15,000]) at room temperature for 2 h. The washed membrane
was then scanned in an Odyssey infrared imaging system (LI-COR
Biosciences) at 700-nm and 800-nm wavelengths to detect EGFP protein and
then reacted with SuperSignal West Pico chemiluminescent substrate
(Thermo Scientific) and scanned using LAS-4000 Image Reader (Fujifilm)
to detect S1 protein.

[FIG 4: panels A and B, phylogenetic trees of human/civet SARS CoVs and bat SARS-like CoVs (scale bars, 0.05); panel C, amino acid alignment covering positions 426 to 486.]

Recombination analysis. To detect possible recombination between
SARS and SARS-like CoVs, the full-length genomic sequence of LYRa11
was aligned with selected human/civet SARS CoVs (Tor2, AY274119;
BJ01, AY278488; SZ3, AY304486) and bat SARS-like CoVs (Rp3,
DQ071615; Rf1, DQ412042; Rs672, FJ588686; Rm1, DQ412043; Rs3367,
KC881006; B41, DQ084199; B24, DQ022305; Yunnan2011, JX993988;
and HKU3, GQ153542) using ClustalW v.2.0. The aligned sequences were
initially scanned for recombination events using the Recombination
Detection Program (RDP; version 4) with the MaxChi and Chimaera
methods using 0.6 and 0.05 fractions of variable sites per window, respectively
(35, 36). The potential recombination events between LYRa11, Rs3367,
Yunnan2011, and Rf1 suggested by RDP with strong P values (<10^-20)
were investigated further by similarity plot and bootscan analyses using
SimPlot v.3.5.1 (35-37). Maximum likelihood trees of four genomic
regions generated by four breakpoints were constructed to illustrate the
phylogenetic origin of parental regions. The breakpoint nucleotide
locations are based on the LYRa11 genome.
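
The idea behind a similarity plot can be sketched in a few lines: a window slides along the alignment, and the identity between the query and one reference is recorded for each window. The Python function below is a simplified illustration only and is not the SimPlot implementation; the name `window_similarity` and the default `window` and `step` values are assumptions chosen for the example and are not taken from this study.

```python
from typing import List, Tuple


def window_similarity(query: str, reference: str,
                      window: int = 200, step: int = 20) -> List[Tuple[int, float]]:
    """Return (window midpoint, percent identity) pairs along an alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Compare only columns in which neither sequence has a gap.
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue
        identity = 100.0 * sum(a == b for a, b in pairs) / len(pairs)
        points.append((start + window // 2, identity))
    return points
```

Running the function once per candidate parent and comparing the resulting curves shows where the closest relative of the query changes along the genome, which is the pattern that suggests a breakpoint.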

Nucleotide sequence accession numbers. The raw data of Solexa
sequencing have been deposited in Short Reads Archives (SRA) under
accession number SRA100822. All amplicon sequences, the S gene of LYRa3,
and the full genome of LYRa11 generated in this study have been
deposited in GenBank under accession numbers KF569973 to KF569997. All
accession numbers of sequences from GenBank used in this study are
shown in the figures.

RESULTS 

Viral metagenomic analysis. After Solexa sequencing and read
annotation, a total of 730,668 useful reads with an average length
of 141 nt were generated, and 32,335 of them (4.43%) were annotated
to viruses, including double-stranded DNA (dsDNA), dsRNA,
and single-stranded RNA (ssRNA) viruses of mammalian, plant,
insect, or bacterial origin (Fig. 2).
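
The E-value cutoff and the viral read fraction can be reproduced with a short script. The Python sketch below assumes that BLAST hits were exported in the standard 12-column tabular layout, in which the E value is the 11th field; the study does not state its export format, and the function names `filter_hits` and `percent` are hypothetical.

```python
from typing import Iterable, List


def filter_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> List[List[str]]:
    """Keep tabular BLAST hits whose E value is below the cutoff."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Assumed layout: E value in the 11th tab-separated column.
        if float(fields[10]) < max_evalue:
            kept.append(fields)
    return kept


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total
```

With the read counts reported above, `percent(32335, 730668)` evaluates to about 4.43, matching the stated proportion of reads annotated to viruses.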

Alphacoronavirus in bats. Of 216 coronavirus-related sequences,
177 matched to the helicase gene of alphacoronavirus,
with 70% nucleotide identities. Pan-CoV RT-PCR screening
showed that 11% (13/120) of bats from Xiangyun, 7% (7/100)
from Bingchuan, and 6% (2/34) from Jinghong were alphacoronavirus
positive (Table 1). Although six amplicon clones of each
sample were randomly chosen for sequencing, they showed almost
100% nucleotide identities, indicating that each sample carried
only one CoV variant. All amplicons and their closest phylogenetic
neighbors from GenBank, along with representatives of 8
approved and several unclassified species in Alphacoronavirus (1),
were aligned. As shown in Fig. 3, 22 amplicons grouped into five
clades with 63 to 79% nucleotide identities between them and
shared 80 to 91% identities with the viruses from Hong Kong,
Guangdong, and Hainan in China, as well as from Spain (32, 38-40).
Although no individual bat carried more than one clade, infection
with different alphacoronaviruses did occur within a bat population
in one location.

FIG 5 (A) Expression of EGFP-S1 fusion proteins in BHK-21 cells; (B) Western blot of expressed EGFP-S1 fusion proteins using rabbit anti-EGFP antibody (left)
and SARS-convalescent human serum (right). The molecular masses are given on the right. BJ, LY, Rp3, and E, respectively, represent EGFP-S1 proteins of SARS
CoV BJ01, bat SARS-like CoVs LYRa11 and Rp3, and EGFP control.

Betacoronavirus. The remaining 39 reads were annotated to
ORF3 of SARS CoV with >91% nucleotide identities. Results of
RT-PCR screening showed that 2/11 (18%) Rhinolophus affinis
bats from Baoshan were positive for SARS-like CoVs, sharing
98.4% nucleotide identity in the RdRp gene with bat SARS-like
CoV Rp3, which was detected in Rhinolophus pearsonii in Guangxi
(12). These two amplicons shared 100% nucleotide identity
(Fig. 3).
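
The positive rates quoted in the text and in Table 1 follow directly from the counts. The Python sketch below recomputes them as a consistency check; the dictionary layout and the helper name `prevalence` are illustrative choices, and the counts are the per-location totals from Table 1 (so Baoshan is 2 of 14 bats overall, whereas the 18% quoted above refers to 2 of 11 Rhinolophus affinis).

```python
# (positive, sampled) per location, taken from the Total row of Table 1.
COUNTS = {
    "Xiangyun": (13, 120),
    "Bingchuan": (7, 100),
    "Jinghong": (2, 34),
    "Baoshan": (2, 14),
}


def prevalence(positive: int, sampled: int) -> int:
    """Positive rate rounded to the nearest whole percent."""
    return round(100 * positive / sampled)


rates = {place: prevalence(pos, n) for place, (pos, n) in COUNTS.items()}

# Overall rate across all four locations.
total_positive = sum(pos for pos, _ in COUNTS.values())
total_sampled = sum(n for _, n in COUNTS.values())
overall = prevalence(total_positive, total_sampled)
```

The per-location values and the overall rate (24 of 268 bats) agree with the rounded percentages shown in Table 1.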

Full genomic sequence comparison. The complete genome of
bat SARS-like CoV LYRa11 (KF569996) and the entire S gene of
LYRa3 (KF569997) were obtained by sequencing several overlapping
amplicons. The nucleotide identity of their complete S genes
was 99%. The full genome of LYRa11 contained 29,805 nt, slightly
larger than that of SARS CoVs and other bat SARS-like CoVs. It
had 40.7% G+C content and the same 13 ORFs as strain Rp3
(Table 2). The full genome of LYRa11 shared ~91% nucleotide
identity with those of SARS CoVs and the most recently reported
SARS-like CoV Rs3367 (29), slightly higher than the highest identity
with other bat SARS-like CoVs published previously (89%).
LYRa11 ORFs were compared with human SARS CoV (Tor2) and
three bat SARS-like CoVs (Rs3367, Rf1, and Rp3) (7, 12, 29).
Table 2 shows that LYRa11 is more closely related to Tor2 and
Rs3367 than to Rf1 and Rp3. In particular, its S gene shares >89%
amino acid identity with Tor2 and Rs3367, significantly higher
than ~80% amino acid identity with Rf1 and Rp3. However,
ORF4 is absent from LYRa11, while it is present in Tor2 and
Rs3367.
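
G+C content, as reported for the LYRa11 genome above, is the share of G and C among the unambiguous bases of a sequence. A minimal Python sketch follows; the function name `gc_content` and the decision to ignore ambiguous symbols such as N are assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)
```

Applied to a genome sequence retrieved under the accession number given in the text, this calculation would be expected to give a value close to the reported G+C content, subject to how ambiguous bases are handled.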

Genetic and antigenic characterization of the S1 domain. The
S gene encodes a spike protein which is a type I transmembrane,
class I fusion protein composed mainly of distinct N-terminal
(S1) and conserved C-terminal (S2) domains. The S1 domain
contains the receptor binding domain (RBD), which mediates
receptor binding of the virus to host cells and determines the host
spectrum (2). Comparative analysis showed that the S1 amino
acid sequence of LYRa11 shared high identity (83.3 to 84.0%) with
those of human/civet viruses and Rs3367 but low identity (62.4 to
66.6%) with those of other bat SARS-like CoVs (Fig. 4A). Bat
SARS-like CoV strain BM48, identified in Rhinolophus blasii from
Bulgaria, was significantly distinct (15), sharing 63.6 to 65.0%
identity with other bat SARS-like CoVs (Fig. 4A). An RBD amino
acid sequence comparison of LYRa11 with human/civet viruses
and bat SARS-like CoVs showed that LYRa11 shares 92.5 to 94.6%
identity with human/civet SARS CoVs and 95.1% with Rs3367. In
contrast, other bat SARS-like CoVs, including BM48, share 58.7 to
61.3% amino acid identities with human/civet viruses (Fig. 4B).
Further alignment of amino acid sequences of the entire receptor
binding motif (RBM), a core part of the RBD, showed a close
genetic relationship of LYRa11 to SARS CoVs and Rs3367 but a
much less close relationship with other bat SARS-like viruses (Fig.
4C). European bat SARS-like CoV BM48 has a 4-residue deletion
(aa 433 to 436) and differs considerably in amino acid composition
from the RBM of human/civet and other bat viruses, while
previously reported bat viruses have 17- or 18-residue deletions
(aa 433 to 437, 457 to 468, and 472). In contrast, LYRa11 and
Rs3367 have no deletion and have almost exactly the same
sequence as SARS CoVs. Of the 2 critical residues in the RBM that play
key roles in receptor recognition and enhancement of receptor
binding (24, 41, 42), only 1 mutation, T487N, was observed in
LYRa11 and Rs3367 compared with SARS CoVs (Fig. 4C).

Based on the results described above, to further characterize the
antigenic reactivity of LYRa11 with SARS CoV-specific antibody
in comparison to that of SARS CoV BJ01 and the representative
bat SARS-like CoV Rp3, S1 proteins of these three viruses were
successfully expressed in BHK-21 cells (Fig. 5A) and then subjected
to Western blot analysis (Fig. 5B). In Western blotting,
anti-EGFP antibody detected three EGFP-S1 proteins (104 kDa)
as well as the EGFP control (27 kDa), indicating correct expression
and effective transfer of the proteins to the membrane, while
SARS-convalescent human serum reacted specifically with EGFP-S1
proteins of BJ01 and LYRa11, but not with those of Rp3 and the EGFP
control. These results indicate that LYRa11 is antigenically more
closely related to SARS CoV than the representative bat SARS-like
CoV Rp3.

FIG 7 CoV-like particle considered to be LYRa11.

Recombination analysis. Due to its unique mechanism of
RNA replication, the CoV genome undergoes high-frequency RNA
recombination between different strains (43). The potential
recombination events between LYRa11 and the other 12 human/civet
and bat SARS-like CoVs were initially predicted using the RDP
program. Results showed that several fragments of LYRa11 were
potential recombinants from Rs3367 and Yunnan2011 when
LYRa11 was set as a query, and four breakpoints were detected in
the LYRa11 genome, generating three recombinant fragments
(Fig. 6B). Detailed analysis of LYRa11, Rs3367, Yunnan2011, and
Rf1 using similarity plot and bootscan analysis of SimPlot
supported the above-given prediction and generated three recombinant
fragments covering nt 20968 to 23443 (fragment 1, including
partial nsp16 and the entire S1 domain), nt 23444 to 24643
(fragment 2, partial S2 domain), and nt 26143 to the end (fragment 3,
including the entire ORFs E, M, 7, 8, 9, 10b, and N) (Fig. 6A to C).
Phylogenetic analyses based on these parental regions suggested
that fragment 1 of LYRa11 was recombinant from lineages that
had ultimately evolved into Rs3367 (Fig. 6D), while fragments 2
and 3 of LYRa11 were recombinants from lineages of Yunnan2011
(Fig. 6E and F).
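
The sizes of the three recombinant fragments follow from the reported coordinates. The short Python sketch below works them out, assuming inclusive 1-based nucleotide positions and taking "the end" as the reported LYRa11 genome length of 29,805 nt; the names `FRAGMENTS` and `fragment_length` are illustrative.

```python
GENOME_LENGTH = 29805  # LYRa11 genome length reported in the text (nt)

# Inclusive start and end coordinates of each fragment on the LYRa11 genome.
FRAGMENTS = {
    1: (20968, 23443),          # partial nsp16 and the entire S1 domain
    2: (23444, 24643),          # partial S2 domain
    3: (26143, GENOME_LENGTH),  # ORFs E, M, 7, 8, 9, 10b, and N
}


def fragment_length(start: int, end: int) -> int:
    """Length of a fragment with inclusive coordinates."""
    return end - start + 1


lengths = {name: fragment_length(s, e) for name, (s, e) in FRAGMENTS.items()}
```

Under these assumptions the fragments span 2,476 nt, 1,200 nt, and 3,663 nt, respectively, which together amount to roughly a quarter of the genome.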

Morphological observation. Pellets of ultracentrifuged rectal
material were resuspended in SM buffer and examined by transmission
electron microscopy (TEM). Three spherical enveloped
viruslike particles of about 130 nm in diameter were observed,
each in a separate field of vision. Surface spikes were apparent, but
not with the typical coronavirus morphology (Fig. 7). To justify
considering these as coronaviruses, therefore, the sample was subjected
to RT-PCR for detection of CoV, respirovirus, morbillivirus,
henipavirus, avulavirus, rubulavirus, and pneumovirus in
Paramyxoviridae and influenza virus A in Orthomyxoviridae using
published methods (44, 45). Results showed that the sample was
positive only for coronavirus.

DISCUSSION 

Following identification of the first bat CoV in 2005 (11, 12),
further CoVs have been discovered in different bat species within
China (summarized in Table 3 and Fig. 1). To date, CoVs have
been found in 20 bat species within 4 families from 13 provinces 
and Hong Kong (11-14, 16, 20, 29, 38, 40). Among these bat 
species, 10 were in the family Vespertilionidae, 8 in Rhinolophidae, 
with one in each of Molossidae and Pteropodidae, suggesting that 
Vespertilionidae and Rhinolophidae comprise the main hosts of 
CoVs. Within the above-named families, the genera Miniopterus 
and Myotis were found to harbor only alphacoronaviruses, while 
bats from the genera Pipistrellus, Tylonycteris, and Rhinolophus 
harbored both alpha- and betacoronaviruses. Table 3 also shows 
that alphacoronaviruses have a wider host range and show greater 
genetic diversity in bats than betacoronaviruses. In addition to 
China, countries reporting bat alphacoronaviruses include Japan 
(46), the United States (47), Spain (32), Germany (48), and Ghana 
(21). Studies have shown that natural infection of various bats
with various alphacoronaviruses is globally distributed and that
bats are susceptible hosts of alphacoronaviruses. In addition, bats
can also harbor diverse betacoronaviruses. According to the 9th
Report of ICTV, since the first betacoronaviruses, i.e., SARS-like
CoVs, were identified in bats, there have been 4 bat betacoronavirus
species identified within the Betacoronavirus genus (1). More
recently, some viruses related to Middle East respiratory syndrome
(MERS) CoV have been discovered in different bat species
in South Africa, Ghana, and Saudi Arabia (49-51). It is apparent
that more betacoronaviruses will be identified in bat populations,
although not as abundantly as alphacoronaviruses. All of the
above indicate that alpha- and betacoronaviruses have different
circulation and transmission dynamics in bat populations.
Among the carriers of betacoronaviruses, which are most associated
with emerging human infectious diseases, Rhinolophus spp.
have been the main hosts found to harbor SARS-like CoVs in
China and therefore have been considered to be the natural hosts
of SARS CoVs (11, 12, 29). With the increasing number of SARS-like
CoVs identified in bats since 2005, the host range of SARS-like
CoVs has extended from Rhinolophus spp. to Chaerephon spp. in
China and Hipposideros and Chaerephon spp. in Africa (13-21).
Most SARS-like CoVs from non-Rhinolophus spp. show far
greater genetic distance to SARS CoVs than those from Rhinolophus
spp. This is especially true for viruses from Africa, which
share less than 83% full genomic identity with SARS CoVs (17,
19, 21), suggesting that the circulation of SARS-like CoVs is
restricted mainly to Rhinolophus spp. but spans wide geographic
locations.

Our attempt to amplify the full S gene of SARS-like CoVs from
positive samples was successful, but amplification of the full S gene
of alphacoronaviruses failed, possibly due to high sequence diversity
as well as the limited sample amount. Instead, a 440-bp highly
conserved region of the RdRp gene was amplified to construct the
phylogenetic tree in the present study. This region is useful for
analyzing diversity, although it cannot accurately determine the
evolutionary status of CoVs (20). Using this region, 5 clades of
alphacoronavirus were identified from 4 of 5 bat species in 3 of the
4 sampled locations, while betacoronavirus was from only one
species in a single location (Table 1, Fig. 1), indicating that bats in
Yunnan have an abundant diversity of CoVs. In the present study,
SARS-like CoV was detected only in 2 of 14 bats in Baoshan. This
sample size was too small to permit detection of alphacoronaviruses,
but betacoronaviruses were not found in 254 bats from the
other three locations, which supports the conclusion that there is
a restricted distribution of betacoronaviruses in the bat population.
Taken together, these data show that circulation and
transmission dynamics of alpha- and betacoronaviruses in bats
are different.

The gene encoding spike protein S is a highly variable region
within the CoV genome. The S protein consists mainly of S1 and
S2 domains, the former containing RBM (aa 426 to 518) within
RBD (aa 319 to 518). RBM, which determines the host tropism of
CoV by binding cell receptor ACE2, is the most variable region (2,
24, 52). The RBM of SARS CoVs is a unique element which initiates
viral infection by specifically binding to the ACE2 receptor of
human and civet cells. In this process, two critical amino acid
residues on RBM (479N and 487T) determine the efficiency of
receptor binding since substitution of both abolishes viral binding
to human ACE2, thereby abrogating the viral infection (41, 42).
Substitution of either residue alone, however, has no significant
impact on human ACE2 binding (24). Of significance is the fact
that the S1 domain of bat SARS-like CoVs reported before 2013
has a very low nucleotide similarity to that of SARS CoVs (Fig. 4A
and B), and there are several key deletions and mutations in their
RBM (Fig. 4C) which distinguish them from SARS CoVs and
make them incapable of infecting humans and civets via binding
to ACE2 (11, 12, 24-27). In contrast, LYRa11 in our study and
Rs3367 reported recently (29) have high sequence identity with
the S1 domain of SARS CoVs, showing almost exactly the same
RBM sequence, with a single amino acid substitution among the
two key sites determining host tropism (Fig. 4A to C). This enables
Rs3367 to use human ACE2, potentially allowing direct human
infection, and to be cross-neutralized by convalescent-phase sera
of SARS patients (29). This property is probably shared by LYRa11
since its S1 domain, in addition to having very high sequence
identity with Rs3367, is efficiently recognized by SARS-convalescent
human serum (Fig. 5B). The clear serological and RBM sequence
evidence shows that LYRa11 is antigenically very close to
SARS CoV. All results given above strongly suggest that LYRa11
and Rs3367 have the potential to directly infect civets and humans
and, as gap-filling viruses between previously reported bat SARS-like
and human SARS CoVs, might be deemed progenitors of
SARS CoVs. Given that LYRa11 shares 91% full genomic identity
with Rs3367, lacks ORF4, and was detected >350 km
from Kunming, where Rs3367 was identified (Fig. 1B), the two
viruses are distinct. It is reasonable to speculate that more
LYRa11- or Rs3367-like viruses will be isolated from bats in the
future.

Due to their unique mechanism of viral RNA replication, CoVs
are prone to recombination during double infections (43). Previous
studies have suggested that SARS CoVs were likely recombinants
originating from strains Rp3 and Rf1 (13, 35), while Rs3367
recombined from lineages that had evolved into human/civet
SARS CoV and bat SARS-like CoV Rs672 (29). Our analysis of the
recombination events among LYRa11 and other SARS or SARS-like
CoVs using RDP and SimPlot suggests that
LYRa11 is a recombinant descending from lineages that had
ultimately evolved into Rs3367 and Yunnan2011, both of which were
detected in Yunnan Province (16, 29). On this basis, it appears that
SARS-like CoVs have been circulating in Yunnan bats for a long
time, with obvious genetic recombination during virus transmission
between bat species.

Our attempts to isolate infectious virus from the bat rectal
samples failed, and only a few CoV-like particles were observed
directly from rectal samples after ultracentrifugation. Reasons for
believing these to be coronaviruses have been provided in Results,
although the uncharacteristic morphology of the surface projections
remains to be explained. Only a few petal-shaped spikes were
observed on the surface of the virions (Fig. 7). Spikes, however, are
composed mainly of S1 and S2 domains, which, respectively, form
the globular portion and the stalk (2). Studies have shown that S1
is not strongly associated with S2 and is easily detached from the
virion during excessive freeze-thawing or ultracentrifugation (53-55);
hence, the observation of only a few intact spikes in our
preparation might be ascribed to damage or loss of S1.

In conclusion, Yunnan is a region with diverse alpha- and
betacoronaviruses. Due to the ease of recombination between different
strains, more diverse bat CoVs are likely to be identified in the
future in this region, with important public health implications.
The identification of bat SARS-like CoVs unable to infect humans
and civets before 2013 prompted speculation about the existence of
SARS-like CoVs able to directly infect humans and civets via wild
animals. This speculation has ended with the identification of
LYRa11 and Rs3367, which are gap-filling viruses and likely have
the ability to directly infect humans. The discovery of LYRa11,
together with Rs3367, has provided an important clue to the origin
of SARS CoV from bat SARS-like CoVs and presents the strongest
evidence so far that bats are the natural hosts of SARS CoVs.